Just Relax and Come Clustering! A Convexification of k-Means Clustering, Report no. LiTH-ISY-R-2992
Authors
Abstract
k-means clustering is a popular approach to clustering. It is easy to implement and intuitive, but it is sensitive to initialization because the underlying optimization problem is non-convex. In this paper, we derive an equivalent formulation of k-means clustering that takes the form of an ℓ0-regularized least squares problem. We then propose a novel convex, relaxed formulation of k-means clustering. The resulting sum-of-norms regularized least squares formulation inherits many desired properties of k-means but has the advantage of being independent of initialization.
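The sensitivity to initialization can be illustrated with a minimal sketch. The code below runs plain Lloyd iterations (the usual k-means heuristic) from two different starting points on a toy data set, and also evaluates one common way of writing a sum-of-norms regularized least squares objective, in which every observation gets its own centroid. The abstract does not state the report's exact objective, so the form used in `son_objective`, as well as the names `lloyd`, `son_objective`, `lam`, and the four sample points, are assumptions made for illustration only.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points]

def lloyd(points, centers, max_iter=100):
    """Plain Lloyd iterations for k-means, started from the given centers."""
    centers = [tuple(float(c) for c in center) for center in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                # Coordinate-wise mean of the points assigned to center k.
                new_centers.append(tuple(sum(col) / len(members)
                                         for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    labels = assign(points, centers)
    cost = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, cost

def son_objective(points, mus, lam):
    """Assumed form: sum_j ||x_j - mu_j||^2 + lam * sum_{i<j} ||mu_i - mu_j||_2.

    One centroid mu_j per observation; observations whose centroids
    coincide are read as belonging to the same cluster.
    """
    fit = sum(sq_dist(x, mu) for x, mu in zip(points, mus))
    penalty = sum(math.sqrt(sq_dist(mus[i], mus[j]))
                  for i in range(len(mus)) for j in range(i + 1, len(mus)))
    return fit + lam * penalty

# Hypothetical toy data: corners of a 4-by-1 rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

_, _, cost_good = lloyd(points, [(0.0, 0.5), (4.0, 0.5)])  # left/right split
_, _, cost_bad = lloyd(points, [(2.0, 0.0), (2.0, 1.0)])   # bottom/top split
print(cost_good, cost_bad)  # 1.0 16.0

two_clusters = [(0.0, 0.5), (0.0, 0.5), (4.0, 0.5), (4.0, 0.5)]
one_cluster = [(2.0, 0.5)] * 4
print(son_objective(points, two_clusters, lam=0.5))  # 9.0
print(son_objective(points, one_cluster, lam=0.5))   # 17.0
```

Both starting points are fixed points of the Lloyd iteration, yet their costs differ by a factor of 16, which is the kind of initialization dependence the abstract refers to. The sum-of-norms objective, as assumed here, is convex in the centroids, so minimizing it does not depend on a starting guess; the sketch only evaluates the objective at two candidate solutions and does not solve the optimization problem.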
Similar resources
A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS
Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Because clustering algorithms are used widely in many fields, a lot of research is still going on to find the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of the cluster centers and can therefore become trapped in a local optimum. In this paper...
Clustering using sum-of-norms regularization; with application to particle filter output computation, Report no. LiTH-ISY-R-2993
We present a novel clustering method, SON clustering, formulated as a convex optimization problem. The method is based on over-parameterization and uses a sum-of-norms regularization to control the trade-off between the model fit and the number of clusters. Hence, the number of clusters can be automatically adapted to best describe the data, and need not be specified a priori. We apply SON cluste...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the rapid growth of spatial trajectory data, together with the need to process it and extract useful information and meaningful patterns, has attracted many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...
Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm
Identifying clusters, or clustering, is an important aspect of data analysis. It is the task of grouping a set of objects in such a way that objects in the same group/cluster are more similar to each other in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis. This paper proposes an improved version of the K-Means algorithm, namely Persistent K...